ABSTRACT


Improving human systems integration through technologically advanced training and
performance  aids  has  become  increasingly  important  to  military  transformation.  Measures  of 
improved  cognitive  and  coordination  processes  arising  from  the  employment  of  transformational 
tools  are  necessary  to  guide  the  refinement  and  future  development  of  such  technologies.  In  this 
paper  we  describe  a  Cognitive  Load  Theory  approach  to  developing  a  combinatory  measure  of 
individual  workload  and  team  performance  following  an  experimental  intervention  involving 
training and a Decision Support System. We discuss how indicators of what we term team
decision efficiency can improve assessment of the effectiveness of transformational processes and
technologies.


Application  of  Cognitive  Load  Theory  to  Developing  a  Measure  of 
Team  Decision  Efficiency 

Improving  human  performance  through  advanced  training  and  decision  aids  is  a  major 
objective  of  military  transformation  advocates.  However,  advances  are  needed  in  diagnostic 
measures  of  cognitive  and  team  coordination  processes  to  better  guide  the  design  and 
development  of  efficient  transformational  technologies.  The  Tactical  Decision  Making  Under 
Stress  (TADMUS)  program,  sponsored  by  the  Office  of  Naval  Research,  successfully 
demonstrated  that  effective  team  training  and  aiding  through  a  Decision  Support  System,  based 
on  cognitive  and  team  task  analyses,  resulted  in  better  performance,  and  with  less  individual 
mental  effort  exerted  (for  a  discussion  of  this  related  research,  see  Cannon-Bowers  &  Salas, 
1998).  The  final  TADMUS  experiment  tested  the  combined  effect  of  training  and  decision 
support,  and  a  recent  analysis  showed  that  decision-making  was  improved  through  these 
interventions  (Smith,  Johnston,  &  Paris,  2003).  In  this  paper  we  build  upon  this  body  of  research 
so as to advance diagnostic measures for assessing human systems integration efficiencies (cf.
Fiore,  Cuevas,  Scielzo,  &  Salas,  2002;  Scielzo,  Fiore,  Cuevas,  &  Salas,  2004).  Towards  that 
goal,  we  describe  and  test  a  measure  named  the  Team  Decision  Efficiency  (TDE)  score  derived 
from  Cognitive  Load  Theory  and  explored  within  TADMUS  experimentation. 

The Team Decision Efficiency measure is part of a theoretical framework developed in
the area of team cognition (Salas & Fiore, 2004) to understand process and performance at the
inter- and intra-individual level (for a full discussion see Fiore, Johnston, Paris, & Smith, in
press). The overarching goal of the framework put forth by Fiore et al. is to aid in theory
development by hypothesizing innovative strategies to assess human systems integration. This
framework describes how measures of team performance can be simultaneously used in
combinatory  analyses  with  subjectively  derived  measures  at  the  individual  level  to  examine  the 
impact  of  technology-based  aids  on  team  process  and  performance.  By  assessing  subjective 
processes  in  ways  analogous  to  those  put  forth  in  the  instructional  sciences,  Fiore  et  al.  (in  press) 
argued  that  we  can  have  a  window  into  the  manner  in  which  processes  at  the  level  of  the 
individual  interact  with,  and  alter,  processes  at  the  team  level.  In  this  paper  we  focus  on  the 
component  of  that  framework  derived  from  Cognitive  Load  Theory  (CLT)  and  test  a  measure 
derived  from  that  approach  -  the  Team  Decision  Efficiency  score.  Although  CLT  has  been  used 
for  a  number  of  years  in  instructional  systems  design  research,  its  application  to  team  decision 
making  represents  a  unique  contribution  to  human-systems  integration  in  general  and  team 
cognition  studies  in  particular. 

Cognitive  Load  Theory 

Cognitive  Load  Theory  (CLT)  has  been  the  focus  of  the  educational  and  instructional 
sciences  for  over  a  decade  (Chandler  &  Sweller,  1991;  Sweller  &  Chandler,  1994;  Sweller, 
Chandler, Tierney, & Cooper, 1990) and its utility continues to grow (see Paas, Renkl, &
Sweller,  2003;  Paas,  Tuovinen,  Tabbers,  &  Van  Gerven,  2003).  CLT  articulates  how  cognitive 
processes  in  working  memory  interact  with  long  term  memory  and  learning  content  and 
performance.  Sweller  (1994)  defines  learning  as  schema  acquisition  and  the  transfer  of  learned 
procedures  as  one  moves  from  controlled  to  automatic  information  processing.  As  knowledge  is 
acquired  it  decreases  the  burden  on  working  memory  (see  also  Chandler  &  Sweller,  1996; 
Sweller, 1988; Van Merrienboer & Paas, 1990). CLT posits that, depending upon the amount of
knowledge already acquired within a given domain, learning and performance can be altered due
to the load imposed by external factors (for a full discussion of CLT, see Sweller, 1999; Sweller
&  Chandler,  1994).  Specifically,  endogenous  and  exogenous  factors  are  present  when  one 
interacts  with  an  instructional  environment.  Endogenous  factors  are  the  long-term  memory 
structures  associated  with  a  particular  learning  content  and  the  working  memory  processes  (see 
Baddeley, 1986; 1992a; 1992b) used when engaged in a learning activity (e.g., Chandler &
Sweller,  1991). 

Exogenous factors such as instructional system design or training content interact with
these endogenous limitations in cognition. For example, without substantial part-task,
simulation-based training, the exogenous factors in learning military tasks (e.g., flying a
high-speed aircraft while operating command and control displays and coordinating with
crewmembers and other external aircraft) are of sufficient quantity to overwhelm human
information processing capacities. Further, the exogenous and endogenous factors can require
either a high or low degree of interaction themselves. For instance, in instructional contexts,
CLT characterizes the forms of cognitive load that result as intrinsic and extrinsic. Intrinsic load
is high when learning content requires a substantial degree of interaction and involves a large
number of cognitive elements. Moreover, when information is new for the learner (i.e., the
long-term memory associated with the content is sparse), the intrinsic load is challenging. Extrinsic
load is described as an additional (artificial) cognitive load imposed by poorly designed
instruction and it is argued to hinder learning (e.g., Kalyuga, Chandler, & Sweller, 1999).

CLT advances the notion that analysis of instructional efficiency, which identifies the
cognitive burden on the trainee in conjunction with performance, may increase the return on
investment in developing training systems. Paas, Van Merrienboer, & Adam (1994) defined their
instructional efficiency construct as the relationship between trainee subjective workload
assessments and overall task performance. The “instructional efficiency score” is calculated
using standardized scores of subjective assessments of mental effort and performance (Paas &
Van Merrienboer, 1993). As one interacts with the learning environment, the burden on working
memory should be subjectively assessed and simultaneously considered with learner
performance because it reveals “important information about the cognitive consequences of
instructional conditions that is not necessarily reflected by traditional performance-based
measures” (Paas & Tuovinen, 2004, p. 134).

For example, within multimedia learning environments, cognitive resources are more
efficiently used when animations are presented with a voiceover than with on-screen text (e.g.,
Kalyuga et al., 1999; Mayer & Moreno, 1998; Mousavi, Low, & Sweller, 1995). The
simultaneous presentation of animation and text (referred to as the principle of redundancy in
theories of multimedia, see Mayer, 2001; Moreno & Mayer, 1999) may produce higher cognitive
load due to overburdening the visuospatial sketchpad in working memory. In contrast, tapping
separate auditory and visual channels achieves greater instructional efficiency because the
working memory burden is reduced, referred to as the principle of temporal contiguity (see
Mayer, 2001). Poorly designed instructional systems that violate the principle of temporal
contiguity may produce low instructional efficiency scores due to increases in workload
concomitant with decreases in performance.

This brief review of CLT was presented within the context of the Navy’s TADMUS
effort because the conceptual underpinnings of that line of inquiry into training and decision
support were based on analogous theories of cognition. In particular, the goal of the system
improvements was to make decisions and team interaction requirements clearer and more
transparent (cf. Marcus, Cooper, & Sweller, 1996). In the present study, using a variant of the
Paas et al. (1994) instructional efficiency scores, we developed the Team Decision Efficiency
score in order to determine if it was possible to add a level of diagnosticity to efforts in
human-systems integration. This score is based upon subjective assessments of workload combined
with objective measures of team tactical performance. The specific derivation of the Team Decision
Efficiency score is described in detail in the Methods section, but, generally, it is derived in a
manner similar to instructional efficiency. The primary difference is that we use team performance
rather than individual performance in the formula. Thus, we label this team decision efficiency
because it is a composite score derived from workload scores of individuals interacting within a
team and the associated team performance scores. Following the theoretical framework put forth by
Fiore et al. (in press), the overall hypothesis was that teams provided training and a Decision
Support System (referred to hereafter as Training/DSS) would perform more efficiently than a control
condition. As such, it was expected that, compared to the control group, teams receiving
Training/DSS would use more effective teamwork processes (information exchange, supporting
behavior, initiative/leadership, and communication), and this would lead to Team Decision
Efficiency scores favoring the Training/DSS condition.

METHODS 

Participants

Participants were 96 US Navy officers enrolled in an officer training program.
Participants were primarily male (Male = 93, Female = 3), and participant rank was Lieutenant
(O-3) with mean years of service at 9.6 years (SD = 3.8). Participants had served at least two
tours on a ship and, in at least one of the tours, each had experience as a Division Head, which is
the equivalent of a first-level manager. Participants did not receive additional payments or course
credits for their efforts. One participant did not complete the NASA-TLX inventory and was
excluded from the analysis.

Design  and  Task 

The research protocol described next is based on previous TADMUS team research (refer
to Johnston, Poirier, & Smith-Jentsch, 1998 for details). The study was a quasi-experimental,
between-groups, post-test only design with two conditions (Training/DSS vs. Control) described
in greater detail below. Each participant was assigned to a six-person team, with eight teams in
each condition. Random assignment to condition was not possible, but efforts were made to
ensure team composition was not biased based on a particular specialty (e.g., engineer versus
combat systems officer). To act as a ship’s air defense warfare team, participants were assigned
to one of the following roles: Commanding Officer (CO), Tactical Action Officer (TAO), Air
Defense Warfare Coordinator (ADWC), Tactical Information Coordinator (TIC), Identification
Supervisor (IDS), and Electronic Warfare Supervisor (EWS). The reporting relationship among
the team members was hierarchical, with the IDS reporting to the TIC, and the TIC reporting to
the ADWC. The ADWC and EWS reported directly to the TAO, and the TAO reported directly to
the CO.

Teams performed their tactical decision making tasks on PC-based watch-stations linked
together through a local area network to form a distributed simulation training system named the
Decision Making Evaluation Facility for Tactical Teams (DEFTT) (Johnston, Poirier, &
Smith-Jentsch, 1998). Event-based simulator scenarios were time-tagged to identify specific
expected team behaviors throughout. All information was unclassified. Headsets supported verbal
communications among team members and role players, and role players read from a script in
order to prevent any deviations from expected events. All participants had at least 48 hours of
DEFTT experience prior to the experiment. Participation in the experiment was incorporated into
the participants’ training schedule.

The  team  task  objective  was  to  perform  a  ship’s  air  defense  warfare  detect-to-engage 
sequence. Team members had to interact with their watch-stations and pass tactical
information to each other to develop an accurate picture of potentially hostile and friendly
aircraft and ship “radar tracks” as they appeared throughout each of four 30-minute scenarios.
Teams  had  to  report  initial  detection  of  a  surface  or  air  track,  track  type  (commercial  or  tactical), 
and  priorities  for  dealing  with  the  most  threatening  contact.  Although  the  simulated  tracks  did 
not  react  to  watch-stander  actions  (i.e.,  they  were  not  intelligent  agents),  as  team  members 
changed  identification  of  specific  tracks  on  the  radar  displays,  that  information  changed  across 
all  watch-stander  displays.  If  a  threatening  track  met  specific  rules  of  engagement,  the  team  had 
to  report  plans  to  obtain  authority  to  prepare  for  the  ship’s  self  defense.  When  approved,  the  team 
had  to  execute  actions  based  on  their  pre-planned  responses  in  accordance  with  rules  of 
engagement. 

Control  Condition.  Teams  in  the  control  condition  performed  their  watch-station  tasks  on 
the  DEFTT  system.  The  TAO  and  CO  shared  a  single  Command  and  Decision  Display 
simulation watch-station configured specifically for them. The TIC, IDS, ADWC, and EWS each
had  a  Command  and  Decision  simulation  watch-station.  In  addition,  the  EWS  had  an  early 
warning  system  simulated  watch-station.  The  research  protocol  for  this  condition  was  based  on 
the  typical  combat  training  the  officers  received  during  their  course  curriculum. 

Training/DSS  Condition.  Team  members  were  assigned  DEFTT  watch-stations  with  the 
exception that the TAO and CO were each assigned a DSS (see Morrison, Kelly, Moore, &
Hutchins, 1998 for details). The DSS operates in a standalone mode, but was synchronized to
run in tandem with DEFTT for this experiment. The TAO and CO received a 45-minute
computer-based DSS tutorial that described display functions and allowed point-and-click
practice. The DSS was designed based on the cognitive tasks underlying TAO and CO decision
making processes, and then a set of supporting command and control displays was developed
and tested (Morrison et al., 1998). The resulting display design on the PC monitor is organized
into four general areas (refer to Smith, Johnston, & Paris, 2004 for details). The upper left side
shows the tactical radar symbols with enhanced shading to delineate areas of weapons
engagement for potentially hostile tracks. The upper right side of the display (Track Summary,
Track Profile, Response Manager) is oriented to present critical information about a single track
(e.g., aircraft, ship, etc.) as efficiently as possible. The lower right side of the screen (Basis for
Assessment and Comparison to Norms) presents historical track information in terms of its
classification as friendly, neutral, or threat. Running from the lower left to the lower right of the
display are the Track Priority and Alerts List that present prioritized summary information
related to the most critical tracks.

Computer-based training and videotape presentation were used to teach Decision Making
and Teamwork Skills. The Decision Making Skills computer-based training (McCarthy,
Johnston, & Paris, 1998) was adapted from critical thinking research (Cohen, Freeman, &
Thompson, 1998), and other research on naturalistic decision making and training (Zsambok &
Klein, 1997). It instructs participants to understand and develop decision-making strategies that
they transfer to the scenario-based, team training environment. Next, participants were
instructed by computer-based training and videotape on Teamwork Skills using Team
Dimensional Training, and then practiced identifying specific combat information center (CIC)
teamwork behaviors together in the classroom. Team Dimensional Training was developed and
validated under previous TADMUS research and later refined under research for shipboard
instructor training and support (Smith-Jentsch, Zeisig, Acton, & McPherson, 1998). Next,
participants assembled in the DEFTT lab and an instructor trained them on how to conduct
structured after action reviews using the Team Dimensional Training Debriefing Guide.
Participants practiced Team Dimensional Training in the context of two DEFTT training
scenarios. The participants were instructed on, and practiced, using the DSS as a replay device
following practice on the training scenarios to highlight critical events and support Team
Dimensional Training discussions.

Dependent  Measures 

Air Defense Warfare Team Observation Measure. The Air Defense Warfare Team
Observation Measure (ATOM) provides scores on four dimensions of teamwork behaviors:
Supporting Behavior, Leadership/Initiative, Information Exchange, and Communications
(Johnston, Smith-Jentsch, & Cannon-Bowers, 1997). A trained rater, blind to conditions, used
team communications transcripts and videotapes to assess team performance on 11 items. Each
item is a five-point scale with anchors at each end. A rater assesses the extent to which a specific
team behavior represented a “real weakness or strength for the team.” An acceptable internal
consistency reliability (alpha) estimate of .79 was found.

Air Defense Warfare Team Performance Index. The Air Defense Warfare Team
Performance Index (ATPI) is a paper-based measure of team task performance on the
“detect-to-engage” (DTE) sequence (Johnston et al., 1997; Paris, Johnston, & Reeves, 2000).
Subject Matter Experts (SMEs) established standards of DTE performance (timing and accuracy) for
the most critical aircraft in each of the four post-test scenarios. Two trained raters, blind to
conditions, used team communications transcripts to judge whether or not, and when, team
members reported correct and incorrect DTE actions. Rater agreement ranged between 91 and
100 percent, with an average agreement of 97 percent. A third rater corrected the minor
disagreements so that a single ATPI would exist for team task performance on each scenario.
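
The agreement figures above can be illustrated with a short sketch. The text does not state how agreement was computed, so this assumes simple percent agreement (the share of judgments on which two raters gave the same code); the function name and the example codes are hypothetical.

```python
def percent_agreement(rater_a, rater_b):
    """Percentage of items on which two raters assigned the same code.

    Assumption: simple percent agreement; the source does not name the
    agreement statistic that was actually used.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must code the same non-empty set of items.")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


# Hypothetical codes for four DTE actions in one scenario.
rater_1 = ["on-time", "late", "on-time", "missed"]
rater_2 = ["on-time", "late", "late", "missed"]
agreement = percent_agreement(rater_1, rater_2)
```

Percent agreement does not correct for chance agreement; it is used here only because it matches the way the result is reported.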

Detection (DE) and Planning/Execution (PE) scores were developed as ATPI subscores
to support diagnosis of team task performance based on the team decision making schema model
by Paris et al. (2000). For the Detection (DE) subscore, teams were evaluated on their accuracy
and timing in reporting initial detection of aircraft, aircraft type (commercial or tactical), and
priorities for dealing with the most threatening aircraft. An On-time DE score is based upon the
team’s timely and accurate responses to all tactical aircraft, across the four scenarios. A Late DE
score is based upon the team’s accurate, but late, responses to all tactical aircraft, across the four
scenarios.

Planning and Execution actions represent the activities performed by the team after the
DE sequence (e.g., warning, challenging, and covering the hostile aircraft with weapons). An
On-time PE score is based upon the team’s timely and accurate planned and executed actions for
all tactical aircraft, across the four scenarios. A Late PE score is based upon the team’s accurate,
but late, planned and executed actions for all tactical aircraft, across the four scenarios.

Perceived Workload. In CLT, mental effort is typically assessed with a Likert scale that
asks participants to rate their perceived level of mental effort. For example, nine-point scales have
anchors ranging from “very, very low mental effort” to “extremely high mental effort” (see Paas,
1992), and six-point scales have anchors ranging from “very easy” to “difficult” (see Marcus et
al., 1996). Along analogous lines, level of mental effort was important to diagnosing the
effectiveness of the TADMUS Training/DSS intervention. In this experiment, a five-item Likert
scale version of the NASA Task Load Index (TLX) asked participants to rate extent of perceived
mental demand, physical demand, temporal demand, effort, and frustration¹ on scales labeled at
each end with the anchors “low” and “high” (Hart & Staveland, 1988). An acceptable internal
consistency reliability (alpha) estimate of .95 was found.
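
For readers unfamiliar with the reliability estimate reported here, the following is a minimal sketch of Cronbach's alpha computed from item-level ratings. It assumes sample variances and complete data; the function name and the ratings are illustrative and are not the study's data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item ratings.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    Assumption: sample variances (n - 1 denominator) and no missing ratings.
    """
    k = len(responses[0])
    item_variances = [variance([r[i] for r in responses]) for i in range(k)]
    total_variance = variance([sum(r) for r in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings from four respondents on three workload items.
ratings = [
    [1, 2, 1],
    [3, 3, 4],
    [4, 5, 4],
    [2, 2, 3],
]
alpha = cronbach_alpha(ratings)
```

Values closer to 1 indicate that the items vary together, which is what the reported estimates of .79 and .95 summarize for the ATOM and TLX items respectively.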

Team  Decision  Efficiency  Score.  The  Team  Decision  Efficiency  Score  was  calculated 
using  an  individual  level  metric  -  that  is,  individual  levels  of  workload  (NASA-TLX),  combined 
with  the  overall  team  ATPI  scores.  Specifically,  a  given  team  had  six  separate  workload  scores, 
but  one  overall  performance  score  was  used  within  a  team.  This  method  was  pursued  for  both 
practical  and  theoretical  reasons.  First,  the  ATPI  is  designed  to  capture  team  performance  but 
there  is  no  equivalent  measure  for  team  workload.  Second,  an  argument  can  be  made  that  this 
method  allows  a  more  precise  form  of  diagnosis.  In  particular,  because  workload  is  an  internal 
state,  attempting  to  observe  workload  based  on  behaviors  is  much  more  problematic  than  doing 
so  with  team  performance.  Thus,  from  the  standpoint  of  team  cognition  (e.g.,  Salas  &  Fiore, 
2004),  combining  perceived  individual  workload  with  team  performance  allows  us  to  capture 
how  team  processes  may  be  related  to  individual  processes. 

Following CLT, the Team Decision Efficiency scores were derived by taking an
individual team member’s standardized TLX score and combining it with their team’s
respective  standardized  performance  scores.  Specifically,  because  “there  is  no  direct  method  for 
mapping  units  of  performance  on  units  of  mental  effort,  the  measures  are  converted  to 
standardized  z-scores”  (Paas  &  Tuovinen,  2004,  p.  142).  Kalyuga  et  al.  (1999)  utilized  this 

¹ Because one of the items used in the NASA-TLX is conceptually distinct from measures of
workload  traditionally  used  in  CLT  (Item  5  assessing  prediction  of  performance),  it  was  not 
included  in  the  overall  sums. 


approach and adapted it to show how such scores can be represented as the perpendicular
distance from a line representing a level of zero efficiency, with the formula

$$E = \frac{z_{\text{performance}} - z_{\text{workload}}}{\sqrt{2}}$$

As  described  by  Paas  et  al.  (2003),  the  square  root  of  two  is  used  based  upon  the  formula  for 
calculating the distance of a point to a line (see also Kalyuga et al., 1999 for a full description of
the  formula’s  derivation).  Because  these  are  standardized  scores  this  results  in  positive  and 
negative  values  that  hover  around  a  mean  of  zero.  Positive  scores  indicate  relatively  better 
performance  in  proportion  to  reported  workload  whereas  negative  scores  indicate  the  opposite 
pattern  (relative  performance  was  less  than  relative  workload). 
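
The origin of the square root of two can be made concrete with a short worked derivation. The symbols $z_P$ (standardized performance) and $z_W$ (standardized workload) are shorthand introduced here for illustration, and the numeric values are hypothetical.

```latex
% Distance from a point (x_0, y_0) to the line ax + by + c = 0:
\[
  d = \frac{\lvert a x_0 + b y_0 + c \rvert}{\sqrt{a^2 + b^2}}
\]
% Plot workload on the x-axis and performance on the y-axis.
% The zero-efficiency line is z_P = z_W, i.e. -x + y = 0, so a = -1, b = 1, c = 0.
% Keeping the sign (positive above the line, negative below it) gives
\[
  E = \frac{z_P - z_W}{\sqrt{(-1)^2 + 1^2}} = \frac{z_P - z_W}{\sqrt{2}}
\]
% Hypothetical example: z_P = 1.0 and z_W = -0.5
\[
  E = \frac{1.0 - (-0.5)}{\sqrt{2}} = \frac{1.5}{\sqrt{2}} \approx 1.06
\]
```

A point above the line (performance higher than workload in standardized units) therefore yields a positive score, and a point below it yields a negative score.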

We  used  the  Kalyuga  et  al.  formula  for  our  analyses  of  the  Team  Decision  Efficiency 
(TDE)  scores  and  this  was  calculated  by  analyzing  the  data  across  the  differing  scenarios.  Our 
interest  was  in  viewing  TDE  across  control  and  experimental  groups,  but  dependent  upon  whom 
within  a  team  was  working  with  the  DSS.  To  address  the  issue  that  only  two  of  the  six  team 
members  were  actually  utilizing  the  DSS,  we  created  a  variable  within  the  teams  so  as  to 
maintain  the  distinction  between  those  with  the  DSS  and  those  without  it.  Specifically,  we 
divided  each  team  into  two  sub-teams  based  upon  their  roles,  that  is,  those  roles  within  the  team 
using  the  DSS  in  the  Experimental  Condition  versus  those  roles  not  using  the  DSS.  The  variable 
we label “Role” has two levels, “Command” (the TAO and CO) versus “Support” (the EWS,
IDS, TIC, and ADWC). The Command role in the Control condition had the same
responsibilities as the Command role in the Experimental Condition; the difference was that they
did  not  have  the  DSS  to  aid  them.  We  reduced  the  data  in  this  way  so  as  to  examine  Team 
Decision  Efficiency  Scores  emerging  within  the  teams  but  based  upon  the  more  global  “roles.” 
This  analysis  allowed  us  to  assess  whether  the  DSS  differentially  impacted  the  roles  within  the 
teams, as well as the other team members. Note that, with this method, instead of 16 teams for
analysis, we have a sample of 32 because the Command versus Support global roles represented
an additional between-participant factor. With respect to the formula’s full derivation, because
we were analyzing the data across the scenarios, we standardized the NASA-TLX values over all
participants and scenarios (6 participants in 16 total teams over 4 scenarios).² For the team
performance  scores,  we  standardized  the  relevant  ATPI  scores  over  all  teams  and  scenarios  (16 
total  teams  over  4  scenarios).  The  derived  z-scores  were  then  used  to  calculate  the  TDE  score 
based  upon  the  formula  described  above.  The  mean  TDE  scores  within  the  aforementioned  team 
roles  were  then  calculated  and  used  for  the  subsequent  analyses. 
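
The computation described in this paragraph can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis code: the data layout, the function names, and the use of sample standard deviations for the z-scores are all assumptions, and the toy values are invented.

```python
from math import sqrt
from statistics import mean, stdev


def z_scores(values):
    """Standardize values to mean 0 and SD 1 (assumption: sample SD)."""
    m, sd = mean(values), stdev(values)
    return [(v - m) / sd for v in values]


def tde_by_role(workload_rows, team_performance):
    """Mean Team Decision Efficiency per (team, role) sub-team.

    workload_rows: (team, role, scenario, tlx_sum) tuples, one per participant
        per scenario (hypothetical layout).
    team_performance: dict mapping (team, scenario) to the team's score.
    """
    # Standardize workload over all participants and scenarios.
    z_workload = z_scores([row[3] for row in workload_rows])

    # Standardize team performance over all teams and scenarios.
    keys = sorted(team_performance)
    z_perf = dict(zip(keys, z_scores([team_performance[k] for k in keys])))

    # Combine each individual's workload with that team's performance.
    grouped = {}
    for (team, role, scenario, _), z_w in zip(workload_rows, z_workload):
        tde = (z_perf[(team, scenario)] - z_w) / sqrt(2)
        grouped.setdefault((team, role), []).append(tde)

    # Average within each sub-team (Command versus Support).
    return {key: mean(scores) for key, scores in grouped.items()}


# Toy data: two teams, one scenario, one member per role.
rows = [
    ("T1", "Command", 1, 2.0), ("T1", "Support", 1, 4.0),
    ("T2", "Command", 1, 3.0), ("T2", "Support", 1, 3.0),
]
performance = {("T1", 1): 10.0, ("T2", 1): 6.0}
tde = tde_by_role(rows, performance)
```

In the toy data, team T1 performs above average, so its low-workload Command member receives a positive score and its high-workload Support member a negative one; both T2 members report average workload with below-average team performance and receive the same negative score.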

Procedure 

Control  Condition:  Participants  assembled  and  filled  out  informed  consent  forms. 
Information  packets  were  provided  that  developed  a  context  and  rationale  for  the  research,  and 
then  participants  completed  a  questionnaire  about  their  work  experience.  Based  on  these 
responses,  members  with  the  most  ship  CIC  expertise  were  assigned  as  TAO  and  CO,  and  the 
remaining  team  members  were  assigned  to  the  remaining  watch-stations.  Next,  team  members 
were trained on their respective DEFTT watch stations. First, a training administrator gave an
introduction  to  CIC  watch  station  responsibilities  and  functions,  and  then  team  members 
practiced  operating  the  watch-stations.  Next,  team  members  participated  in  two  20-minute 
training  scenarios  to  complete  their  familiarization  with  system  functions,  operations,  and  team 
interactions.  Next,  teams  performed  on  each  of  the  four  30-minute  Arabian  Gulf  scenarios. 
Scenario order was counterbalanced. Prior to each scenario run, team members conducted a quick
pre-brief to familiarize themselves with important scenario background information (e.g.,
geopolitical situation, communications plan, identification matrix, and rules of engagement). At
the end of each scenario session, team members filled out the NASA TLX. Then, they used a
Scenario Event Summary Sheet to guide their after action review of team performance.
Following experiment completion, participants were provided feedback on performance as a way
to ensure they received training value for their efforts.

² For two of the participants, one TLX data point (of his/her four possible) was missing.

Training  and  DSS.  The  experimental  condition  involved  participation  over  two  days.  The 
first  day  participants  filled  out  informed  consent  forms,  and  then  participated  in  the  two  and  one 
half  hour  Decision  Skills  computer-based  training.  The  second  day  team  members  completed  the 
demographics  questionnaire  and,  as  in  the  control  condition,  were  assigned  to  watch-stations 
based  on  experience.  Next,  the  CO  and  TAO  were  trained  in  the  use  of  the  DSS  while  the  other 
team  members  received  DEFTT  familiarization  training.  All  team  members  then  received  the 
Team  Dimensional  Training  computer-based  training  and  videotape,  practiced  Team 
Dimensional  Training  in  the  DEFTT  with  two  training  scenarios,  and  employed  the  DSS  during 
their  after  action  review.  At  the  end  of  training,  teams  were  reminded  they  should  use  a  Scenario 
Event  Summary  Sheet,  DSS,  and  Team  Dimensional  Training  Debriefing  Guide  to  conduct  their 
after  action  reviews.  Following  training  the  same  protocol  was  used  as  in  the  control  condition. 

RESULTS

We conducted a preliminary analysis of the team process behaviors in order to document
Team Dimensional Training intervention effectiveness. Our primary analysis concerned the
simultaneous assessment of performance and workload using the Team Decision Efficiency
scores. Thus, the preliminary analysis tested whether Team Dimensional Training was
successful in supporting the learning and implementation of team process behaviors, and the
primary analysis examined Team Decision Efficiency through comparison of the Training/DSS
and Control conditions.


Team Process Behaviors. A 2-way between-subjects MANOVA was performed on the
four dependent teamwork behavior variables (Supporting Behavior, Leadership/Initiative,
Information Exchange, and Communications) with one independent variable (Training/DSS
versus Control). Supporting our first hypothesis, results showed a significant effect of training
on team performance behaviors, F(4, 11) = 4.74, p < .02. Associated univariate tests for the
training factor revealed a significant main effect on Information Exchange, F(1, 14) = 15.77,
p < .01, Communications, F(1, 14) = 10.43, p < .01, and Leadership/Initiative, F(1, 14) = 6.31,
p < .03, and a marginally significant effect on Supporting Behaviors, F(1, 14) = 3.4, p < .09.
Figure 1 illustrates the differing scores for these team process behaviors across condition.


Insert Figure 1 Here


Team Decision Efficiency. A 2 x 2 x (2 x 2) mixed-model, repeated measures ANOVA
was run on the Team Decision Efficiency Scores with Condition (Training/DSS versus Control)
and Role (Command versus Support) as the between participant factors, and Decision Task (DE
versus PE) and Timing (On Time versus Late) as the within participant factors. Estimated
marginal means are reported below.

First, we find a significant interaction between Condition and Role, F(1, 28) = 4.29, p <
.05. Figure 2 shows the standardized Decision Efficiency Scores for the interaction between
condition and role, illustrating the larger difference between conditions for those in the
“Command” role. Specifically, for team members in the Command role, the Training/DSS
condition produced positive efficiency scores (M = .199), while the Control condition
produced negative scores (M = -.315). In the Support role, the difference between the
Training/DSS (M = -.049) and Control (M = .081) conditions was much smaller. What this
interaction suggests is that the DSS had an impact on workload/performance, but only for
those roles utilizing the DSS. We next look at our within participant factors to examine
whether the Team Decision Efficiency score varied depending upon the nature of the
decision task and the timing of those decisions.


Insert  Figure  2  Here 


For our second effect, we find a significant interaction between Condition and Decision
Task, F(1, 28) = 4.73, p < .05. Figure 3 shows the standardized Decision Efficiency Scores for
this interaction. Overall, the Training/DSS condition produced positive efficiency scores while
the Control condition produced negative scores. But on the PE scores the difference was greater
between the Training/DSS (M = .115) and the Control (M = -.158) conditions. The difference
for DE scores was substantially smaller between the Training/DSS (M = .035) and the Control
(M = -.076) conditions. Thus, collapsed across on-time and late scores, while the Training/DSS
had a small impact on the DE scores, there was a large difference across the PE scores, with the
Control group showing a negative score and the Experimental group showing a positive score.
This suggests that the teams with the Training/DSS were performing better on the PE decision
processes, but this did not come at a cost of higher workload (i.e., they performed better while
reporting relatively lower workload).


Insert  Figure  3  Here 


There was not a significant interaction between Condition and Timing, F(1, 28) = 2.43, p
< .15. Although this interaction was not significant, we see that, for the on-time scores, both the
Training/DSS (M = .012) and the Control (M = -.054) conditions are near zero, indicating
relatively equal workload and performance. Figure 4 shows the standardized Decision
Efficiency Scores between condition and timing of decision. The difference in the late scores
was substantially larger between the Training/DSS (M = .138) and the Control (M = -.180)
conditions. Thus, collapsed across DE and PE scores, while the Training/DSS condition had little
impact for the on-time scores, there was a large difference across the late scores, with the control
group showing a negative score and the experimental group showing a positive score. Post-hoc
analysis showed that this difference was significant, t(30) = 1.9, p < .05, one-tailed. This
suggests that the Training/DSS teams were performing more deliberately (i.e., taking
more time) but better, and this deliberation did not come at a cost of higher workload (i.e., they
performed better while reporting relatively lower workload).
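
As a reading aid, the gaps between conditions described in the preceding paragraphs can be tabulated directly from the estimated marginal means reported in the text. The sketch below is only arithmetic on those reported values; the dictionary layout and the function name `condition_gaps` are illustrative assumptions, not part of the original analysis.

```python
# Estimated marginal means as reported in the text: (Training/DSS, Control).
reported_means = {
    ("Role", "Command"): (0.199, -0.315),
    ("Role", "Support"): (-0.049, 0.081),
    ("Decision Task", "PE"): (0.115, -0.158),
    ("Decision Task", "DE"): (0.035, -0.076),
    ("Timing", "On Time"): (0.012, -0.054),
    ("Timing", "Late"): (0.138, -0.180),
}


def condition_gaps(means):
    """Return Training/DSS minus Control for each factor level (hypothetical helper)."""
    return {key: round(training - control, 3)
            for key, (training, control) in means.items()}


gaps = condition_gaps(reported_means)
for (factor, level), gap in gaps.items():
    # A positive gap means Training/DSS scored higher than Control.
    print(f"{factor:13s} {level:8s} {gap:+.3f}")
```

The larger gaps fall on the Command role, the PE decision task, and the late decisions, which matches the pattern described in the text.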


Insert Figure 4 Here


Last, we find a significant 3-way interaction between Condition, Timing, and Decision
Task, F(1, 28) = 4.16, p = .05. Figure 5 shows the standardized Decision Efficiency Scores for
this interaction. Across the majority of the decisions, we see slightly positive or negative scores,
indicating relatively equal levels of workload and performance. Consistently, across these scores,
we see the control group showing negative scores and the experimental group showing positive
scores. Further, the largest difference across conditions was for the PE late scores, with these
scores in favor of the Training/DSS. Thus, mirroring the prior interactions, this shows that the
largest impact for the Training/DSS teams occurred in the late scores, but in this case, primarily
for the PE decision processes.


Insert Figure 5 Here


DISCUSSION 

In this paper the principle of instructional efficiency was expanded to encompass
analyzing how training and DSS influence process and performance for tactical teams, yielding
what we termed the Team Decision Efficiency score. Following Fiore et al. (in press) we tested a
portion of a framework developed to devise new strategies for assessing human systems
integration. The goal of this line of inquiry is to demonstrate how theoretically sound constructs
and measurement techniques from domains outside the military sciences can aid in our diagnoses
of team processes when technology is designed as a performance aid. Overall, we found that the
Team Decision Efficiency score was sensitive to the TADMUS Training/DSS intervention.
Specifically, we see that incorporating a DSS into team processes can have a differential impact
on team decision efficiency, suggesting potential benefits to process and performance. This
difference manifests itself to a greater extent on scores related to planning and execution
decision processes.

Our rationale for this metric was that combining individual mental effort scores with
overall team performance scores can be indicative of the effectiveness of training and systems
interventions. By simultaneously considering individual measures of workload across multiple
scenarios in conjunction with team performance, we were able to illustrate how interventions
reduced relative workload. The positive Team Decision Efficiency scores suggest that the
Training/DSS resulted in less cognitive demand and better performance.
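
To make the idea of combining workload and performance concrete, the following is a minimal sketch that assumes the score follows the standard instructional efficiency formulation from the Cognitive Load Theory literature, E = (z_performance - z_effort) / sqrt(2). The exact computation behind the Team Decision Efficiency score may differ from this; the input values and the function names `z_scores` and `efficiency_scores` are hypothetical and chosen only for illustration.

```python
import math
import statistics


def z_scores(values):
    """Standardize values to mean 0 and (sample) standard deviation 1."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def efficiency_scores(performance, workload):
    """Assumed formulation: E = (z_performance - z_workload) / sqrt(2).

    Positive E: performance is high relative to reported workload.
    Negative E: performance is low relative to reported workload.
    """
    z_perf = z_scores(performance)
    z_load = z_scores(workload)
    return [(p - w) / math.sqrt(2) for p, w in zip(z_perf, z_load)]


# Hypothetical data: one team performance score and one workload rating
# (e.g., a NASA TLX rating) per participant. Not values from the study.
performance = [70.0, 55.0, 80.0, 60.0]
workload = [40.0, 65.0, 35.0, 60.0]

scores = efficiency_scores(performance, workload)
for perf, load, e in zip(performance, workload, scores):
    print(f"performance={perf:5.1f} workload={load:5.1f} efficiency={e:+.2f}")
```

Under this formulation a score near zero indicates that standardized performance and workload are roughly in balance, which is how the near-zero means in the results are interpreted.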

These  analytical  techniques  are  important  because  they  allow  us  to  determine  the  relative 
effectiveness  of  technology-enabled  team  processes,  thereby  identifying  differing  forms  of 
improvement  techniques  for  either  design  or  training  remediation.  Specifically,  rather  than  just 
noting  performance  was  low,  measures  of  efficiency  allow  us  to  determine  where  perceptions  of 
workload  are  high  versus  low  (see  Cuevas,  Fiore,  &  Oser,  2002).  What  we  suggest  is  that  this 
efficiency  score  can  serve  to  identify  human  performance  improvements  in,  and  problems  with, 
new  training  strategies  and  decision  aiding  systems.  In  particular,  with  evidence-based  training 
and  aiding  systems,  the  efficiency  score  can  serve  to  identify  training  remediation  strategies.  For 
example,  team  members  reporting  low  workload  and  performing  poorly  may  require  a  different 
form  of  feedback  in  their  after  action  review  (i.e.,  need  to  improve  teamwork  processes)  than 
teams  performing  poorly,  but  reporting  high  workload  (i.e.,  need  to  improve  use  of  decision 
aiding system). As such, leveraging metrics from differing fields such as the instructional
sciences allows us to produce diagnostic techniques to improve the way human-systems
integration is tested in general, and how feedback is delivered and used in particular.
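
The remediation example above can be sketched as a simple decision rule. This is only an illustration: the use of standardized scores, the cut-off at zero, the treatment of adequately performing members, and the function name `suggest_feedback` are all assumptions introduced here, not a procedure described in the study.

```python
def suggest_feedback(z_performance, z_workload):
    """Map standardized performance and workload to a feedback focus.

    Assumptions (hypothetical): scores are z-scores and zero separates
    "low" from "high" on both measures.
    """
    if z_performance >= 0:
        # The text only discusses poorly performing members; this branch
        # is an assumption added to make the function total.
        return "no remediation indicated"
    if z_workload < 0:
        # Low workload with poor performance.
        return "improve teamwork processes"
    # High workload with poor performance.
    return "improve use of decision aiding system"


print(suggest_feedback(-0.8, -0.5))  # poor performance, low workload
print(suggest_feedback(-0.8, 0.9))   # poor performance, high workload
```

In practice any such thresholds would need to be set from the data and the training objectives rather than fixed at zero.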

In conclusion, and from a broader perspective, applying such cognitive theories as CLT
(see Cuevas, Fiore, Bowers, & Salas, 2004) to designing measures of human performance serves
two related goals. First, from the theoretical level, it moves us closer to understanding and better
diagnosing processes related to team cognition (Salas & Fiore, 2004). Second, from the practical
level, it helps us in our efforts to transform the state of military training and decision aiding
systems. In this paper we demonstrated how the Team Decision Efficiency measure can assess
the combined effect of training and decision aiding. In support of analogous theorizing coming
out of the instructional sciences, these techniques “can reveal important information about the
cognitive consequences of instructional conditions that is not necessarily reflected by traditional
performance-based measures” (p. 134, Paas & Tuovinen, 2004). Using diagnostic measurement
methods can support identifying ways to reduce extrinsic cognitive load, thereby facilitating the
return on investment in human systems integration design and development.

Figure 2. Standardized Team Decision Efficiency Scores for the Interaction between
Condition and Role.

Figure 3. Standardized Decision Efficiency Scores for the Interaction between Condition
and Decision Type.

Figure 4. Standardized Decision Efficiency Scores for the Interaction between Condition
and Timing of Decision.

Figure 5. Standardized Decision Efficiency Scores for the Interaction between Condition,
Timing, and Decision Task.